5 research outputs found

Assessing the Potential of Classical Q-learning in General Game Playing

Author: CB Browne
CJCH Watkins
CP Robert
D Silver
D Silver
H Wang
J Hu
J Méhat
M Genesereth
M Genesereth
M Świechowski
RS Sutton
V Mnih
Publication date: 14/10/2018

& Stone, IJCAI 2007) in GGP. In this paper we implement Q-learning in GGP for three small-board games (Tic-Tac-Toe, Connect Four, Hex) (source code: https://github.com/wh1992v/ggp-rl), to allow comparison to Banerjee et al. We find that Q-learning converges to a high win rate in GGP. For the ϵ-greedy strategy, we propose a first enhancement, the dynamic ϵ algorithm. In addition, inspired by (Gelly & Silver, ICML 2007) we combine online search (Monte Carlo Search) to enhance offline learning, and propose QM-learning for GGP. Both enhancements improve the performance of classical Q-learning. In this work, GGP allows us to show, if augmented by appropriate enhancements, that classical table-based Q-learning can perform well in small games.

Comment: arXiv admin note: substantial text overlap with arXiv:1802.0594

arXiv.org e-Print Archive

Crossref

Leiden University Scholarly Publications

Neural Learning of Heuristic Functions for General Game Playing

Author: C Bishop
D Silver
GB Huang
J Cirasella
M Świechowski
NY Liang
R Penrose
S Russell
V Mnih
YH Pao
Publication venue: Springer Science and Business Media LLC
Publication date: 01/01/2016

Crossref

Institutional Research Information System University of Turin

Assessing the Potential of Classical Q-learning in General Game Playing

Author: CB Browne
CJCH Watkins
CP Robert
D Silver
D Silver
H Wang
J Hu
J Méhat
M Genesereth
M Genesereth
M Świechowski
RS Sutton
V Mnih
Publication venue: Springer Science and Business Media LLC
Publication date: 25/09/2019

After the recent groundbreaking results of AlphaGo and AlphaZero, we have seen strong interests in deep reinforcement learning and artificial general intelligence (AGI) in game playing. However, deep learning is resource-intensive and the theory is not yet well developed. For small games, simple classical table-based Q-learning might still be the algorithm of choice. General Game Playing (GGP) provides a good testbed for reinforcement learning to research AGI. Q-learning is one of the canonical reinforcement learning methods, and has been used by (Banerjee & Stone, IJCAI 2007) in GGP. In this paper we implement Q-learning in GGP for three small-board games (Tic-Tac-Toe, Connect Four, Hex), to allow comparison to Banerjee et al. We find that Q-learning converges to a high win rate in GGP. For the ϵ-greedy strategy, we propose a first enhancement, the dynamic ϵ algorithm. In addition, inspired by (Gelly & Silver, ICML 2007) we combine online search (Monte Carlo Search) to enhance offline learning, and propose QM-learning for GGP. Both enhancements improve the performance of classical Q-learning. In this work, GGP allows us to show, if augmented by appropriate enhancements, that classical table-based Q-learning can perform well in small games.

Computer Systems, Imagery and Medi

Crossref

Leiden University Scholarly Publications

On-Line Parameter Tuning for Monte-Carlo Tree Search in General Game Playing

Author: A Benbassat
B Bouzy
CB Browne
EK Burke
GMJB Chaslot
GMJB Chaslot
J Fürnkranz
JPAM Nijssen
L Kocsis
L Kocsis
M Świechowski
MJW Tak
P Auer
R Coulom
R Coulom
S Ontanón
SM Lucas
Y Björnsson
Y Björnsson
Publication venue: Springer Science and Business Media LLC
Publication date: 01/01/2018

Maastricht University Research Portal

Crossref

Comparison of wastewater treatment plants based on the emissions of microbiological contaminants

Author: A Carducci
A Pardakhty
A Pringle
A Roodbari
A Stobnicka
Anuar Kassim
C Tomasi
DC Blanchard
DJ O’Connor
E Brągoszewska
E Brągoszewska
E Brągoszewska
E Korzeniewska
G Gregová
Gabriel Andari Kristanto
H Bauer
H Soltani
J Mandal
J Thorn
JS Pastuszka
K Uhrbrand
M Fouladgar
M Michałkiewicz
M Sánchez-Monedero
M Yoosefian
MC Shravanthi
Michał Michałkiewicz
N Wéry
NL Fernando
PC Mouli
R Świechowski
RL Górny
RS Dungan
S Ahmadzadeh
S Ahmadzadeh
S Ahmadzadeh
S Ahmadzadeh
S Ahmadzadeh
S Fuzzi
S Mentese
S Niazi
S Sabariego
T Maki
W Adamus-Białek
Y Han
Y Li
Publication venue: Springer Science and Business Media LLC

Crossref